This is an R Markdown Notebook

Your project - Detect Anomalies in an Industrial Process with Deep Learning

Deep Learning… In this lecture I would like to explore with you the deep learning we can do with the H2O package in R.

Data Exploration

library(tidyverse)
library(plotly)

Attaching package: ‘plotly’

The following object is masked from ‘package:ggplot2’:

    last_plot

The following object is masked from ‘package:stats’:

    filter

The following object is masked from ‘package:graphics’:

    layout
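
The chunks that follow assume three data frames, DF_NORMAL, DF_TEST and DF_ANOMALY, are already in the workspace. A minimal sketch of loading them, assuming the files DATA-normal.rds, DATA-test.rds and DATA-anomaly.rds (the names used in the notebook source) sit in one directory. The helper name load_process_data is hypothetical, and base readRDS() stands in for readr's read_rds() so the sketch needs no packages:

```r
# Hypothetical helper: read the three process datasets from `data_dir`.
# File names follow the notebook source; adjust them to your own layout.
load_process_data <- function(data_dir = ".") {
  files <- c(normal  = "DATA-normal.rds",
             test    = "DATA-test.rds",
             anomaly = "DATA-anomaly.rds")
  paths <- file.path(data_dir, files)
  missing <- !file.exists(paths)
  if (any(missing)) {
    stop("Missing data file(s): ", paste(paths[missing], collapse = ", "))
  }
  # Return a named list: one data frame per dataset
  stats::setNames(lapply(paths, readRDS), names(files))
}

# Usage, once the files are in place:
# datasets   <- load_process_data(".")
# DF_NORMAL  <- datasets$normal
# DF_TEST    <- datasets$test
# DF_ANOMALY <- datasets$anomaly
# names(DF_NORMAL)   # inspect the column names
```

Failing early on a missing file keeps the later select() and as.matrix() calls from producing a less obvious error.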

About the NORMAL dataset

NORM_2016 <- DF_NORMAL %>% select(2:11) %>% as.matrix() 
summary(NORM_2016)

This dataset was selected .. after .. the period of deep process maintenance

plot_ly(z = NORM_2016, type = "surface")

About the TEST dataset

This dataset contains some values that may indicate a potential anomaly

TEST_2015 <- DF_TEST %>% select(2:11) %>% as.matrix()
plot_ly(z = TEST_2015, type = "surface")

About the ANOMALY dataset

This dataset contains some values that indicate an anomaly


TEST_2017 <- DF_ANOMALY %>% select(2:11) %>% as.matrix()
plot_ly(z = TEST_2017, type = "surface")

Summary: three datasets were selected.

Train Deep Learning Model

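library(h2o)
h2o.init(nthreads = 2)

# convert the matrices to H2O frames
train <- as.h2o(x = NORM_2016, destination_frame = "train")
test <- as.h2o(x = TEST_2015, destination_frame = "test")
anomaly <- as.h2o(x = TEST_2017, destination_frame = "anomaly")

# ?h2o.deeplearning
# train an autoencoder on the normal data only
normality_model <- h2o.deeplearning(x = names(train), 
                                     model_id = "DeepLearning_id20180317",
                                     training_frame = train, 
                                     activation = "Tanh", 
                                     autoencoder = TRUE, 
                                     hidden = c(8,5,8), 
                                     sparse = TRUE,
                                     l1 = 1e-4, 
                                     epochs = 100)

# save this model
h2o.saveModel(normality_model, getwd())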

Reconstruct the dataset
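
The reconstruction step passes data through the trained autoencoder and returns the network's approximation of each input row. A minimal sketch, assuming an H2O cluster is running and that normality_model and train exist as in the training chunk. The wrapper name reconstruct_matrix is hypothetical, and calls are namespaced with h2o:: so the function can be defined before the package is attached:

```r
# Hypothetical wrapper: run `frame` through the autoencoder and return the
# reconstructed values as a plain numeric matrix (one column per input).
reconstruct_matrix <- function(model, frame) {
  reconstructed <- h2o::h2o.predict(model, frame)
  as.matrix(reconstructed)
}

# Usage (requires a running H2O cluster and the trained model):
# train_recon <- reconstruct_matrix(normality_model, train)
# plotly::plot_ly(z = train_recon, type = "surface")
```

Comparing this surface with the one plotted from NORM_2016 gives a quick visual check of how well the model has learned the normal pattern.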

Check MSE
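
h2o.anomaly() returns one reconstruction error per row, in a column named Reconstruction.MSE; rows the autoencoder reconstructs poorly get a high value. A minimal sketch of scoring the three frames and stacking the results for plotting, assuming the frames train, test and anomaly exist on a running H2O cluster. The helper name stack_mse is hypothetical:

```r
# Hypothetical helper: combine per-dataset MSE data frames into one long
# data frame with a `label` column and a running `index` for the x axis.
stack_mse <- function(mse_list) {
  stopifnot(!is.null(names(mse_list)))
  labelled <- Map(function(df, label) {
    df$label <- label
    df
  }, mse_list, names(mse_list))
  stacked <- do.call(rbind, unname(labelled))
  stacked$index <- seq_len(nrow(stacked))
  stacked
}

# Usage (requires a running H2O cluster and the trained model):
# mse_all <- stack_mse(list(
#   normal  = as.data.frame(h2o::h2o.anomaly(normality_model, train)),
#   test    = as.data.frame(h2o::h2o.anomaly(normality_model, test)),
#   anomaly = as.data.frame(h2o::h2o.anomaly(normality_model, anomaly))
# ))
# ggplot2::ggplot(mse_all, ggplot2::aes(x = index, y = Reconstruction.MSE,
#                                       colour = label)) +
#   ggplot2::geom_line()
```

If the model has learned the normal behaviour, rows from the anomaly period should show up as a visibly higher band of Reconstruction.MSE than rows from the normal period.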

# shutdown JVM
h2o.shutdown(prompt = FALSE)
[1] TRUE

Homework:

  • try different activation functions and calculate the MSE each time
  • try sparse = FALSE: what changes?
  • try models of high and very high complexity, calculate the MSE, and conclude which one is best
  • try hidden layers 10:100:10: does it work?
  • try hidden layers 200:200: will it work? What is the resulting MSE?
  • try 5 hidden layers
  • try replicate_training_data = FALSE: how much does the training time change?
  • weights_column sets the observation weights: try adding one column of observation weights by importance (see the help)
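
Several of the exercises above amount to retraining the autoencoder with one setting changed and comparing the error. A minimal sketch of that loop, assuming a running H2O cluster and the train frame. The helper name and the particular candidate settings are hypothetical, and the mean of Reconstruction.MSE on the training frame is used as a simple summary score:

```r
# Candidate settings to compare (illustrative choices, not recommendations).
candidates <- list(
  tanh_8_5_8      = list(activation = "Tanh",      hidden = c(8, 5, 8)),
  rectifier_8_5_8 = list(activation = "Rectifier", hidden = c(8, 5, 8)),
  tanh_10_100_10  = list(activation = "Tanh",      hidden = c(10, 100, 10)),
  tanh_200_200    = list(activation = "Tanh",      hidden = c(200, 200))
)

# Hypothetical helper: train one autoencoder per candidate and report the
# mean reconstruction MSE on `frame` together with the elapsed training time.
compare_autoencoders <- function(frame, candidates, epochs = 100) {
  results <- lapply(names(candidates), function(id) {
    cfg <- candidates[[id]]
    elapsed <- system.time(
      model <- h2o::h2o.deeplearning(x = names(frame),
                                     training_frame = frame,
                                     autoencoder = TRUE,
                                     activation = cfg$activation,
                                     hidden = cfg$hidden,
                                     epochs = epochs)
    )[["elapsed"]]
    mse <- as.data.frame(h2o::h2o.anomaly(model, frame))$Reconstruction.MSE
    data.frame(id = id, mean_mse = mean(mse), seconds = elapsed)
  })
  do.call(rbind, results)
}

# Usage (requires a running H2O cluster):
# compare_autoencoders(train, candidates)
```

A lower training MSE does not by itself make a better detector; what matters is how clearly the anomalous rows separate from the normal ones, so it is worth scoring the test and anomaly frames for each candidate as well.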

Ways to productionize the model

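library(h2o)
# start (or reconnect to) H2O; loading a model needs a running cluster
h2o.init(nthreads = 2)
# load the model from disk; this path assumes it was saved to the working directory with h2o.saveModel()
loaded_model <- h2o.loadModel(file.path(getwd(), "DeepLearning_id20180317"))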
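
Once the model is loaded, scoring new process data follows the same path as in training: convert the data to an H2O frame, then ask for the reconstruction error. A minimal sketch, assuming a running H2O cluster and new data with the same columns the model was trained on. The function name score_new_data is hypothetical:

```r
# Hypothetical scoring function for new observations.
# `new_data` must have the same columns the model was trained on.
score_new_data <- function(model, new_data) {
  frame <- h2o::as.h2o(new_data)
  mse <- as.data.frame(h2o::h2o.anomaly(model, frame))
  # One reconstruction error per input row, in the original row order
  mse$Reconstruction.MSE
}

# Usage (requires a running H2O cluster and a loaded model):
# new_mse <- score_new_data(loaded_model, TEST_2017)
# summary(new_mse)
```

Wrapping the scoring step in one function keeps the production path small: whatever schedules the job only needs the saved model and the latest batch of process data.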